ABSTRACT

We consider how underused computing resources within an enter- 
prise may be harnessed to improve utilization and create an elastic 
computing infrastructure. Most current cloud provision involves a 
data center model, in which clusters of machines are dedicated to 
running cloud infrastructure software. We propose an additional 
model, the ad hoc cloud, in which infrastructure software is distri- 
buted over resources harvested from machines already in exis- 
tence within an enterprise. In contrast to the data center cloud 
model, resource levels are not established a priori, nor are re- 
sources dedicated exclusively to the cloud while in use. A partici- 
pating machine is not dedicated to the cloud, but has some other 
primary purpose such as running interactive processes for a par- 
ticular user. We outline the major implementation challenges and 
one approach to tackling them. 

1. INTRODUCTION 

Computational and storage resources within organizations are
often under-utilized. This under-utilization is likely to increase
with further adoption of cloud services. A volunteer cloud
infrastructure, supporting what we term ad hoc cloud computing,
would allow cloud services to run on existing heterogeneous hardware.

If available, such infrastructure could improve organizations' 
resource utilization while offering some of the benefits of more 
conventional public and private clouds. This could yield signifi- 
cant cost savings. The model is analogous to volunteer computing 
as exemplified by Condor [21] and BOINC [22], although it poses 
considerable additional implementation challenges. 

In particular, we are interested in increasing utilization of general- 
purpose computers in offices and laboratories. As a motivating 
example, the (small) University of St Andrews operates in the 
region of ten thousand machines in offices and labs. In aggregate, 
their unused processing and storage capacity represents a major
untapped computing resource. 

The recent Draft NIST Working Definition of Cloud Computing 
[20] defines both public and private cloud models. Both may be 
termed data center models, in which clusters of machines are 
dedicated to running cloud infrastructure software. We propose to 
introduce an additional deployment model, the ad hoc cloud, in 
which infrastructure software is distributed over resources har- 
vested from machines already in use. By ad hoc we mean that the 
set of machines comprising the cloud changes dynamically, as 
does the proportion of each machine's computational and storage 
resources that can be harnessed at a given point in time. Thus, in 
contrast to the data center cloud model, resource provisioning 
levels are not established a priori, nor are resources committed 
exclusively to the cloud while in use. A participating machine is 
not dedicated to the cloud, but has some other primary purpose 
such as running interactive processes for a particular user, albeit
often for a small proportion of the time. One of the most impor- 
tant research issues is how to reduce the impact of cloud opera- 
tions on such processes to an acceptable level. 

The availability of ad hoc clouds could yield various benefits to 
individual enterprises. Firstly, it could reduce the number of
machines that need to be purchased. Such costs are borne directly by
enterprises employing private clouds, and indirectly by those
using external cloud providers.

The use of ad hoc clouds could also reduce the need for specia- 
lized infrastructure for resilience, such as redundant power and 
cooling systems, battery backup, etc. This represents 25% of data 
center costs [13]. Rather than ensuring resilience of a small num- 
ber of physical buildings, the grain of resilience could be ex- 
panded by using more widely distributed machines and tolerating 
individual building failures. 

Ad hoc clouds could reduce overall power consumption. One fac- 
tor is a reduction in the total number of machines required — 
significant since the energy cost of manufacture for a computer 
has been estimated as four times that used during its lifetime [23]. 
Another is that since machines comprising an ad hoc cloud infra- 
structure are situated in working spaces, the power consumed is 
partially offset (in temperate climates) by a reduction in the power 
required for heating. Conversely, where cooling is needed, machines
are housed at lower densities than in data centers, so less active
cooling is required.

A similar idea, that of Nebulas, was proposed in [4]. Here we 
outline a specific approach to developing such ad hoc infrastructure.
Section 2 surveys related work; Section 3 outlines requirements and
the principal implementation challenges; Section 4 describes our
proposed approach to the problem; Section 5 presents an example.

2. RELATED WORK 

The approach we describe can be compared and contrasted with 
grid and volunteer computing, and provider-specific clouds. 

Grid computing emerged principally to address requirements from 
e-Science, in which there was a growing need for software plat- 
forms that supported sharing of resources to support collaborative 
data analysis in computationally intensive science. Grid compu- 
ting provides facilities for the sharing of computational resources, 
often across administrative domains, with a view to enabling ef- 
fective collaboration between the owners of data or computational 
resources. To support such capabilities, grid toolkits (e.g. Globus 
[9]) provide core facilities that support operating system style 
functionalities such as file access, job execution and authentication,
across heterogeneous platforms. These can be used to support
higher-level services such as distributed file systems (e.g.
SRB [18]), workflow execution (e.g. Condor-G [12]) and 
workflow management (e.g. Pegasus [8]). Higher-level grid func- 
tionalities, such as abstract workflow specification in Pegasus, 
often make use of lower-level platforms (e.g. Pegasus uses Con- 
dor-G for managing dependencies between multiple jobs, which 
in turn uses Globus for job execution and file replica manage- 
ment). 

Grids have been a focus of considerable research, development 
and commercial activity for a decade, giving rise to a range of 
approaches and emphases. A significant portion of the work fo- 
cuses on connecting high-end, heterogeneous computational re- 
sources across multiple administrative domains, with a view to 
supporting virtual organisations, for example [11]. This emphasis 
has not been substantially changed by the move towards service- 
oriented grid architectures (e.g. [10]), in which resources are vir- 
tualised as web services, and thus grid functionalities are made 
available as part of a wider, service-oriented architecture. As such, 
the grid community has considerable experience in the develop- 
ment of techniques for providing abstractions over heterogeneous 
platforms. 

The cloud vision has elements in common with the objectives of 
grid computing, in particular a reduction in costs through resource 
sharing, and improvements in flexibility and reliability. However, 
different starting points have given rise to differing architectures 
and emphases. Broadly, grids have sought to support coordinated 
use of distributed resources for carrying out computationally in- 
tensive tasks for modest numbers of users, whereas clouds have 
focused on coordinated use of largely centralised resources for 
large numbers of less demanding requests from distributed users. 

Volunteer computing (VC), sometimes described as a desktop 
grid, uses individual users' machines to perform computationally 
intensive tasks. It is particularly suited for 'embarrassingly paral- 
lel' problems, e.g. SETI@home, one of a number of popular 
projects based on the BOINC framework [22]. Ad hoc clouds 
share the goal of 'stealing cycles' from user machines, but target 
more diverse applications. They can be viewed as offering the 
resource utilisation benefits of VC while avoiding the limitations 
of low or fluctuating volunteering rates, and providing the elasticity
to workloads that makes the cloud vision appealing.

The Condor platform [21] also supports resource harvesting for 
highly parallel tasks. However, Condor is concerned with task 
scheduling whereas our approach targets a more general applica- 
tion-hosting model, in particular the support of interactive and 
data-centric applications. 

The best-known examples of Cloud computing, such as those of 
Amazon, Google, Yahoo! and Microsoft, have several aspects in 
common. For example, early clouds have been developed to sup- 
port scale-out: the execution of large numbers of typically con- 
strained requests over potentially huge data sets. This in turn has 
led to the development of simplified but scalable computational 
models, such as Google's MapReduce framework [6], which pro- 
vides a simple model for distributing highly parallelisable prob- 
lems over large machine clusters. The implementation abstracts 
over the details of distributing input data to individual machines 
and collecting results, and has been widely adopted by other cloud 
platforms, which often make use of Hadoop [3], an open-source 
implementation of the MapReduce model. MapReduce, in common
with early cloud data management platforms such as Amazon's
Simple Storage Service (S3) and SimpleDB [2], and
Google's Bigtable storage system [5], provides carefully con- 
strained capabilities. Google AppEngine also provides a con- 
strained model, specifically targeting web applications. 
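
The constrained programming model can be illustrated with a minimal, single-process sketch of the map, shuffle and reduce phases. It is not Google's or Hadoop's implementation, which distribute these phases over a cluster; the names map_reduce, word_mapper and count_reducer are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run map, shuffle and reduce in one process (illustration only)."""
    groups = defaultdict(list)
    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle phase: group intermediate values by key.
            groups[key].append(value)
    # Reduce phase: combine the values collected for each key.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Hypothetical reducer: sum the counts emitted for one word."""
    return sum(values)


counts = map_reduce(
    ["ad hoc cloud", "cloud element", "Cloud"], word_mapper, count_reducer
)
```

The programmer supplies only the mapper and the reducer; partitioning the input and collecting results is left to the framework, which is what makes the model both restrictive and easy to scale.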

At a lower level of abstraction, Amazon Elastic Compute Cloud 
(EC2) [2] allows an application to be structured as a set of poten- 
tially communicating virtual machine instances. The term 'elastic' 
refers to the flexibility with which instances may be created and 
discarded dynamically, allowing the computing resources allo- 
cated to applications to scale as required. 

Early support for cloud service developers, then, offers two dis- 
tinct styles: high-level APIs that significantly constrain service 
structure, and low-level machine virtualisation that gives almost 
complete freedom but provides little assistance with partitioning 
and managing the service across virtualised instances. 

From the perspective of the service provider, constrained service 
provision offers distinct benefits, as discussed for cloud data ser- 
vices in the Claremont Report on Database Research: 

"Early cloud data services offer an API that is much more re- 
stricted than that of traditional database systems, with a minimal- 
ist query language and limited consistency guarantees. This push- 
es more programming burden on developers, but allows cloud 
providers to build more predictable services, and to offer service 
level agreements that would be hard to provide for a full-function 
SQL data service. More work and experience will be needed on 
several fronts to explore the continuum between today's early 
cloud data services and more full-functioned but probably less 
predictable alternatives. " [1] 

More recently, Amazon and Microsoft have introduced full rela- 
tional database facilities, while the Windows Azure platform [17] 
offers a rather richer set of APIs to programmers. The cloud infra- 
structure remains targeted at dedicated servers. 

3. RESEARCH ISSUES 

Ad hoc clouds can be thought of as a generalization of public or 
private data center clouds, in which certain assumptions are re- 
laxed. These include the degree of homogeneity and availability 
of servers, and the presence of non-cloud processes on cloud 
hosts. Ad hoc clouds could host and coordinate services that are 
more diverse than those currently associated with high-level cloud 
APIs, while providing the service developer with richer support 
for service partitioning and management than machine virtualisa- 
tion approaches. They would operate over shared, heterogeneous 
resources, thus giving rise to requirements for more complex au- 
tomatic management and more sophisticated quality of service 
handling. 

We identify a resulting set of issues that would need to be ad- 
dressed: 

Core functionality: 

• What are the architectural requirements for an ad hoc cloud 
infrastructure? 

• What mechanisms are needed to allow convenient access to 
services, without single points of failure? 



• How can membership of the set of machines in an ad hoc 
cloud be controlled? In which situations should an ad hoc 
cloud be scaled out or contracted? 

• For some application classes, current cloud approaches scale 
well in stable environments — to what extent can these re- 
strictions be relaxed while retaining scalability? 

Automatic adaptation: 

• Can speculative plans for actions that might improve ad hoc 
cloud operation be generated automatically? 

• What techniques are needed in order to model ad hoc cloud 
behavior to enable useful estimates of the consequences of 
possible autonomic reconfiguration? 

• How should planning and modeling processes be coordi- 
nated? Where are they executed, and how are their resources 
allocated? 

• To what extent can planning decisions be improved using 
measurements and predictions of previous, current and fu- 
ture workloads? 

• What model calibration techniques are needed? In what sit- 
uations do phase changes in user behavior or the environ- 
ment cause a previously accurate model to diverge from re- 
ality, and how can this be handled? 

• How can the characteristics of a particular application be 
taken into account in determining how the ad hoc cloud 
adapts to support it? 

Quality of service: 

• To what extent can useful QoS guarantees be delivered to 
ad hoc cloud clients while limiting disruption for machine 
owners to acceptable levels? 

• In what way does the class of computation supported by a 
cloud influence the quality of service guarantees that can be 
provided? 

• What are the appropriate forms for expression of high-level 
policy goals, for an entire ad hoc cloud, and for specific ser- 
vices? 

• Can high-level goals be translated automatically to corres- 
ponding concrete actions? 

• How should measured low-level properties be aggregated 
for reporting in terms of high-level goals? 

• Under what circumstances can conflicting policies be de- 
tected and automatically resolved? 

• What mechanisms can be used to coordinate potentially 
complementary services (e.g. block storage, file systems, 
databases), so that they align with one another rather than 
competing unnecessarily for resources? 



4. OUR APPROACH 

An ad hoc cloud should be self-managing in terms of resilience, 
performance and balancing potentially conflicting policy goals. 
For resilience it should maintain service availability in the pres- 
ence of membership churn and failure. For performance it should 
be self-optimizing, taking account of quality of service require- 
ments. It should be acceptable to machine owners, by minimising 
intrusiveness and supporting appropriate security and trust me- 
chanisms. 

We identify several desirable features for the general ad hoc cloud 
architecture: 

• Agnostic as to service type: the approach can be applied to 
different styles of cloud service, for example infrastructure, 
platform or application as service [20]. 

• Agnostic as to means of control: the approach allows differ- 
ent forms of autonomic decision making to be deployed at 
different points in the architecture. 

• Agnostic as to grain of control: autonomic behaviour may 
be coarse or fine-grained at different points. For example, a 
dispatcher may balance load only as requests leave queues, 
or may change resource allocations for running jobs. 

4.1 Core Cloud Functionality 

There exist successful architectures for large-scale cloud compu- 
ting in the data center style [2, 3, 5, 6]. The principal additional 
challenges in supporting ad hoc clouds lie in accommodating 
highly dynamic machine membership, and allowing cloud compu- 
tations to co-exist satisfactorily with non-cloud processes. Al- 
though data center clouds deal with machine failures automatical- 
ly, the churn will be significantly higher in ad hoc clouds, arising 
from more frequent rebooting of personal machines and the fre- 
quent unavailability of portable devices. Machines may also be- 
come unavailable to the ad hoc cloud for unpredictable periods, 
even though they remain connected and functioning, due to the 
higher priority of fluctuating user workloads. Data center clouds 
do not support co-existence of cloud and user processes; an ad 
hoc cloud architecture must support monitoring of impact on user 
processes, rapid relocation or shut-down of cloud processes, and 
modelling of cloud computation to allow sensible initial place- 
ment of cloud processes. 

Here we sketch a possible architecture as a starting point. We 
define an ad hoc cloud as the union of a set of cloudlets, each of 
which provides a particular service or application. A cloudlet 
service may be specified and accessed via Web Services, or any 
other convenient protocol. Each cloudlet runs on a potentially 
dynamically changing set of physical machines. A given machine 
may host parts of multiple cloudlets. 



Figure 1. Cloudlets 

The software running on a particular machine, contributing to a 
particular cloudlet, is termed a cloud element. A cloudlet may be 
expanded or contracted by altering the number of machines, and 
hence cloud elements, assigned to it. The cloud elements compris- 
ing a given cloudlet communicate with one another to coordinate 
their activity. The cloud elements within a cloudlet may be, but 
need not be, homogeneous in terms of their functionality. This 
structure is illustrated in Figure 1 (although it may suggest that the 
cloud elements assigned to each particular cloudlet run on physi- 
cally close machines, this is for convenience of drawing only, and 
no such restriction is imposed by the architecture). 
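
The relationship between cloudlets, cloud elements and machines can be sketched as a small data model. This is only an illustration under our own assumptions: the class and method names (CloudElement, AdHocCloud, expand, contract) are hypothetical and do not correspond to an existing implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CloudElement:
    """Software on one machine contributing to one cloudlet."""
    cloudlet: str
    machine: str


@dataclass
class AdHocCloud:
    """An ad hoc cloud viewed as the union of its cloudlets' elements."""
    elements: set = field(default_factory=set)

    def expand(self, cloudlet, machine):
        # Growing a cloudlet means assigning it an element on another machine.
        self.elements.add(CloudElement(cloudlet, machine))

    def contract(self, cloudlet, machine):
        # Shrinking a cloudlet removes its element from a machine.
        self.elements.discard(CloudElement(cloudlet, machine))

    def machines_of(self, cloudlet):
        return {e.machine for e in self.elements if e.cloudlet == cloudlet}

    def cloudlets_on(self, machine):
        # A single machine may host parts of several cloudlets.
        return {e.cloudlet for e in self.elements if e.machine == machine}
```

Because an element is identified by the pair (cloudlet, machine), a machine hosts at most one element per cloudlet, while nothing constrains which machines a cloudlet spans.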



4.1.1 Cloud Elements

Each physical machine available to the ad hoc cloud may host a 
number of cloud elements, each assigned to a different cloudlet. 
The machine runs cloud infrastructure software, which supports 
secure creation, management and destruction of cloud elements. 
Finally, the machine also executes non-cloud processes for the 
primary user. This is shown in Figure 2. 

The cloud infrastructure contains an element manager for creating 
and destroying cloud elements. It also contains a model- 
ler/manager — which interacts with the host operating system in 
order to monitor effects of the local cloud elements on the user 
processes, and vice versa — and a broker and dispatcher, whose 
functions are described in the QoS section. 
Figure 2. Node Structure



Each cloud element running on a machine contains an engine 
capable of running the class of computations appropriate to its 
cloudlet. The engine in the diagram is labelled W to signify that it
can execute a particular class of workloads W, corresponding to 
the cloudlet functionality. An engine may provide application 
functionality directly via a user-level API, or support a further 
layer of application software loaded onto it. For example, one 
engine might provide a SQL API that accepts user queries direct- 
ly; other engines might provide MapReduce functionality, or Java 
or JavaScript interpreters. In all but the first case a corresponding 
program would also be loaded. Each engine runs on an abstract 
machine. This might be a VMware-style virtual machine, a Java-style
VM, or something else tailored to the target computation
class. Whatever it is, it must provide sufficient isolation of the 
element from the non-cloud processes on the node. 

The cloud element also contains its own modeller/manager, which 
has knowledge of the semantics of W, and a cost model that al- 
lows reasoning about how the computation will be executed. The 
purpose of the modeller/manager is to control the operation of its
associated engine so as to minimise disruption to user
processes, to optimise its contribution to cloudlet functionality 
within such constraints, and to publish information to support 
effective deployment and adaptation of cloud elements. 

The infrastructure modeller/manager provides a conduit for com- 
munication between cloud element modeller/managers and the 
host OS. For example, it allows cloud element modeller/managers 
to be aware of the current resource demands of non-cloud user 
processes, and hence endeavour to avoid undue disruption. Com- 
munication between cloud element modeller/managers is neces- 
sary in order to coordinate operations across a cloudlet. 
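
One way a node might use this information is sketched below: the infrastructure modeller/manager reports the CPU fraction consumed by user processes, and the remainder, less a safety reserve, is shared among local cloud elements. The function names, the 20% reserve and the weighting scheme are assumptions made for illustration, not part of the architecture described here.

```python
def cloud_cpu_budget(user_cpu_fraction, reserve=0.2):
    """CPU fraction that local cloud elements may use in total.

    user_cpu_fraction: share currently consumed by non-cloud user processes.
    reserve: headroom kept free so that bursts of user activity are not
             disrupted (an assumed policy parameter).
    """
    if not 0.0 <= user_cpu_fraction <= 1.0:
        raise ValueError("user_cpu_fraction must lie in [0, 1]")
    return max(0.0, 1.0 - user_cpu_fraction - reserve)


def share_budget(budget, weights):
    """Split a CPU budget among cloud elements in proportion to weights."""
    total = sum(weights.values())
    if total == 0:
        return {name: 0.0 for name in weights}
    return {name: budget * w / total for name, w in weights.items()}
```

When user demand rises, the budget shrinks to zero before user processes are affected, which corresponds to throttling or suspending the local cloud elements.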

4.2 Automatic Adaptation 

The cloud infrastructure should automatically minimize the costs 
of: 

• deploying, operating and evolving the ad hoc cloud so 
that it is both highly-available to ad hoc cloud users and 
non-disruptive to primary users 

• ensuring that the execution of applications is efficient 
and reliable, with high probability, over long periods 

This requires the ad hoc cloud to exhibit autonomic capabilities 
[14]. It should automatically adapt the extent of each cloudlet and 
the placement of its data and computation, driven by appropriate 
management policies. The requirement poses a significant scien- 
tific challenge, since it involves an understanding of the 
cost/benefit ratios associated with any given policy on resource 
allocation. Dynamic, multi-objective optimisation is necessary in 
order to coordinate resource utilisation policies, and to evolve 
policies in response to changing circumstances — about which 
knowledge is typically scarce. 

Because an ad hoc cloud runs on non-dedicated resources, har- 
vesting and harnessing activities must remain acceptably non- 
intrusive, requiring stringent constraints on policies and their ac- 
tions. Moreover, since the usage patterns of machine owners may 
vary widely, policies must be informed by efficient, scalable mon- 
itoring and performance modelling from which reliable, robust 
cost/benefit ratios can be derived.
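
As a deliberately simple illustration of trading benefit against intrusiveness, the sketch below filters candidate adaptation actions by a hard disruption limit and then ranks the remainder by a weighted difference. This weighted-sum scalarisation is only one of many possible multi-objective techniques and is not claimed to be the approach adopted here; the Action fields and numeric estimates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    benefit: float      # estimated gain for cloud users (assumed units)
    disruption: float   # estimated cost to the machine owner (same units)


def choose_action(actions, max_disruption, disruption_weight=1.0):
    """Pick the admissible action with the best benefit/disruption trade-off.

    Actions whose estimated disruption exceeds max_disruption are rejected
    outright; None is returned when no action is admissible.
    """
    admissible = [a for a in actions if a.disruption <= max_disruption]
    if not admissible:
        return None
    return max(
        admissible,
        key=lambda a: a.benefit - disruption_weight * a.disruption,
    )
```

The hard limit models the stringent constraint on intrusiveness, while the weight expresses how strongly a policy penalises residual disruption among the actions that remain.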



4.3 Quality Of Service 

To deliver QoS guarantees, two separate services are required: one to
make policy decisions based on QoS negotiations with external
parties, and one to provide mechanisms to implement those decisions.
We propose to use the Broker and Dispatcher patterns [16]. These 
are embodied as distributed services that are structured in the 
same way as user-level services, making them autonomic and 
capable of changing their behaviour and resource usage in re- 
sponse to changing request patterns. 

A broker establishes and manages up-front agreements with users 
of the cloud, for the provision of services at certain QoS levels for 
certain periods. It matches reservation requests against expected 
available resources in tandem with other commitments, and in- 
forms the requester whether or not the request can be satisfied. 
Where the broker reaches an agreement, it seeks to ensure that 
resources are pre-configured (e.g. with suitable service deploy- 
ments) in a way that enables the agreement to be met. 

The requirements of different types of service may need to be 
coordinated to meet QoS goals. The broker identifies whether the 
resource requirements and their associated constraints can be met. 
Such assessment requires access to information on the available 
computational resources, their historic loads and availabilities and 
other commitments that relate to them. A search must be made for 
a future configuration that is predicted to meet the requirements of 
the current request and future requests. 
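
The admission decision made by a broker can be sketched as follows, assuming that time is divided into discrete slots and that a single forecast of available capacity per slot is available. The Broker class, its request method and the slot model are hypothetical simplifications; a real broker would also consider historic availability, multiple resource types and reconfiguration.

```python
class Broker:
    """Admit reservations against forecast capacity and prior commitments."""

    def __init__(self, forecast):
        # forecast[t]: capacity expected to be harvestable in time slot t.
        self.forecast = list(forecast)
        # committed[t]: capacity already promised in time slot t.
        self.committed = [0.0] * len(self.forecast)

    def request(self, start, end, amount):
        """Try to reserve `amount` for slots start..end-1; return success."""
        if not 0 <= start < end <= len(self.forecast):
            return False
        slots = range(start, end)
        # Reject if any slot would exceed its forecast capacity.
        if any(self.committed[t] + amount > self.forecast[t] for t in slots):
            return False
        for t in slots:
            self.committed[t] += amount
        return True
```

A request is accepted only if it fits alongside all existing commitments in every slot it covers, so the requester receives a clear yes or no.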

A dispatcher exposes the user-level cloud services. Where a re- 
quest is made to access a service for which there is an established 
agreement, the dispatcher makes use of the resources reserved by 
the broker in anticipation of the request. Where there is no pre- 
arranged agreement, the request is still directed to a relevant ser- 
vice by the dispatcher on a best-effort basis. 
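
The two routing cases can be sketched as below. The Dispatcher class, the reservation table keyed by (user, service), and the least-loaded rule for best-effort requests are all assumptions introduced for illustration.

```python
class Dispatcher:
    """Route requests to reserved resources, else on a best-effort basis."""

    def __init__(self, reservations, hosts):
        # reservations: (user, service) -> node reserved by the broker.
        self.reservations = reservations
        # hosts: service -> {node: current load}, as reported by monitoring.
        self.hosts = hosts

    def route(self, user, service):
        node = self.reservations.get((user, service))
        if node is not None:
            # An agreement exists: use the resources set aside for it.
            return node, "reserved"
        candidates = self.hosts.get(service, {})
        if not candidates:
            raise LookupError(f"no cloud element hosts {service!r}")
        # No agreement: pick the least loaded host (assumed policy).
        return min(candidates, key=candidates.get), "best-effort"
```

Keeping the reservation lookup separate from the best-effort policy mirrors the split between broker (policy) and dispatcher (mechanism).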

5. EXAMPLE 

As an example of this architecture we describe the H2O database
system [15]. H2O is a relational database based on the open
source H2 system [19] that is intended for deployment on an ad
hoc cloud. It offers a full set of relational operations including a
user interface and JDBC linkage. Unlike traditional desktop relational
systems such as MySQL, the system resides within heterogeneous
desktop systems hosted within an enterprise. To understand how the
H2O system maps onto the architecture described in this paper, two
components of H2O must be considered separately: those corresponding
to the cloudlet, and those corresponding to the cloud elements.

The cloud elements each run a complete relational database en- 
gine which is responsible for whole or partial relational tables 
from the database. This corresponds to the engine component 
shown in Figure 2. A Java abstract machine, augmented by a re- 
stricted interface to local persistent storage, hosts each database 
engine. The modeller/manager element tracks the amount of per- 
sistent storage used on the node and the bandwidth usage etc. The 
programs that are executed by the engine are SQL fragments that 
are delivered to the node either from a local user interface or sent 
by another cloud element. In addition to the network interface 
exporting SQL functionality, a Java RMI interface provides
functionality for configuring both individual cloud elements and
deployments on each node.

The cloudlet component is required to track the individual elements
of the database and maintain the database metadata. For example,
individual relations are autonomically replicated across
multiple cloud elements for resilience. Consequently, when a 
query attempts to access a relation for the first time or following a 
failure, it must query the cloudlet to bind to the managers running 
within an individual cloud element. This functionality is achieved 
by running a distributed database manager in the cloudlet above a 
P2P infrastructure, with each of the individual components of the 
cloudlet being hosted by cloud elements. 
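
The binding step can be illustrated with a small in-memory directory that maps each relation to the cloud elements holding a replica. This is a sketch under our own assumptions, not the actual interface of the database system: CloudletDirectory, register and bind are hypothetical names, and the real metadata service is distributed over a P2P infrastructure rather than held in one object.

```python
class CloudletDirectory:
    """Track which cloud elements hold a replica of each relation."""

    def __init__(self):
        self._replicas = {}  # relation name -> list of cloud element ids

    def register(self, relation, element):
        self._replicas.setdefault(relation, []).append(element)

    def bind(self, relation, failed=frozenset()):
        """Return a cloud element serving `relation`.

        `failed` lists elements the caller has found unreachable, so that a
        query rebinding after a failure is directed to another replica.
        """
        for element in self._replicas.get(relation, []):
            if element not in failed:
                return element
        raise LookupError(f"no live replica of {relation!r}")
```

A query binds once on first access and consults the directory again only after a failure, passing the failed element so that a different replica is chosen.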

The cloud infrastructure on each node provides the ability to in- 
stantiate cloud elements on individual nodes. We have developed 
technology in earlier systems [7] that permits securely signed 
bundles of code and data to be instantiated on machines. 

6. CONCLUSIONS 

The ad hoc cloud model could allow complex cloud-style applica- 
tions to exploit untapped resources on non-dedicated hardware. 
We believe that this approach has the potential to: 

• enable organizations to reduce IT costs; 

• enable organizations to obtain the benefits of cloud 
computing in new application areas; 

• reduce net energy consumption by IT activities. 

We have outlined a case for ad hoc cloud computing, a set of re- 
sulting research challenges, and as a starting point, a proposed 
architecture. 